fix: anchor/alternation PIKEVM routing + B5/B12 backref support #82
Eliminates B10/B15 FallbackPatternDetector predicates and partially
eliminates B16 by routing the affected DFA_*_WITH_GROUPS patterns to
PIKEVM_CAPTURE before the DFA state-count ladder:
- B10: optional prefix before capturing group (e.g. -?(-?.{3}).)
- B15: capturing group in quantified alternation (e.g. (a|b){2,})
- B16 (partial): nullable outer quantifier on capturing group with
non-nullable content (e.g. (a)?); patterns where both the outer
quantifier and group content are nullable (e.g. (0*-?){0,}) still
fall back to JDK via the new hasNullableGroupContentWithNullableQuantifier
predicate.
Both the capture-ambiguous TDFA path and the non-ambiguous DFA-with-groups
path now have the three gates before the DFA strategy ladder. Fuzz gate:
findings=0 (9530 patterns, 76240 inputs).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
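For reference, the capture results that any native strategy has to reproduce for the three routed classes can be checked against `java.util.regex`. This is a minimal sketch of the expected semantics only; the class and helper names are hypothetical and it does not call Reggie.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: prints the JDK's answer for group 1 on one
// sample input per pattern class (B10, B15, B16).
final class CaptureReference {
    /** Group 1 of the first match, "<unset>" if it did not participate, "<no match>" otherwise. */
    static String group1(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "<no match>";
        }
        String g = m.group(1);
        return g == null ? "<unset>" : g;
    }

    public static void main(String[] args) {
        // B10: optional prefix before the capturing group.
        System.out.println(group1("-?(-?.{3}).", "-abcd")); // abc
        // B15: capturing group in a quantified alternation keeps the last iteration.
        System.out.println(group1("(a|b){2,}", "abab"));    // b
        // B16: nullable outer quantifier; the group does not participate on "b".
        System.out.println(group1("(a)?", "b"));            // <unset>
    }
}
```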
Add PIKEVM gate inside the capturing TDFA isAnchorConditionDiluted() block: patterns where both branches share a leading character but one branch carries a start-anchor guard (e.g. ^x|x(y)) now route to PIKEVM_CAPTURE instead of the JDK fallback. PikeVM evaluates ^/\A correctly against the search-region origin since commit 0acfc66. Patterns with optional quantifiers, nullable branches, or leading end-anchors still fall through to the anchorConditionDiluted JDK path. Fuzz gate confirms zero divergences with the new routing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
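The JDK behaviour for ^x|x(y) illustrates why the anchored branch has to be evaluated against the start of the search region. The sketch below uses `Matcher.region` with its default anchoring bounds; treating that as equivalent to Reggie's "search-region origin" is an assumption, and the class name is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: which branch of ^x|x(y) wins depends on where the
// search region starts, because ^ is tied to the region origin.
final class AnchorAlternationReference {
    private static final Pattern P = Pattern.compile("^x|x(y)");

    /** Returns "<match>/<group 1>" for the first match inside [start, end). */
    static String describe(String input, int start, int end) {
        Matcher m = P.matcher(input).region(start, end);
        if (!m.find()) {
            return "no match";
        }
        return m.group() + "/" + m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(describe("xy", 0, 2));  // x/null  (anchored branch wins at the origin)
        System.out.println(describe("axy", 0, 3)); // xy/y    (^ fails at index 1, second branch captures)
        System.out.println(describe("axy", 1, 3)); // x/null  (region origin moved to index 1)
    }
}
```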
- NFABytecodeGenerator: add zero-length early-accept before bounds/regionMatches in generateBackreferenceCheck; groupLen==0 trivially succeeds (vacuous match)
- FallbackPatternDetector: replace broad hasNullableBackrefGroup B7 guard with narrowed hasAmbiguouslyNullableBackrefGroup that only falls back when the group body can capture strings of length > 1 (unbounded contamination risk); groups with max capture length <= 1 (e.g. a?, [x]?) are safe with the early-accept
- BackrefEngineGapsTest: enable b7_nullableBackrefGroupInOptimizedNfa
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
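The ordering described for the backreference check (zero-length accept first, then bounds, then comparison) can be written out in plain Java. This is a hypothetical source-level rendering of the idea, not the bytecode that generateBackreferenceCheck emits; the method and parameter names are assumptions.

```java
// Hypothetical plain-Java analogue of a backreference check with a
// zero-length early-accept placed ahead of the bounds test.
final class BackrefCheckSketch {
    /**
     * Does the text captured in [groupStart, groupEnd) occur again at pos?
     * An empty capture matches vacuously, even at the very end of the input.
     */
    static boolean backrefMatches(String input, int pos, int groupStart, int groupEnd) {
        int groupLen = groupEnd - groupStart;
        if (groupLen == 0) {
            return true;                       // early-accept: nothing to compare
        }
        if (pos + groupLen > input.length()) {
            return false;                      // bounds check
        }
        return input.regionMatches(pos, input, groupStart, groupLen);
    }

    public static void main(String[] args) {
        // (a?)\1c on "c": group 1 is empty, so \1 succeeds without consuming input.
        System.out.println(backrefMatches("c", 0, 0, 0));
        // (a?)\1c on "aac": group 1 is "a" and must repeat at index 1.
        System.out.println(backrefMatches("aac", 1, 0, 1));
        // (a?)\1c on "ac": "a" does not repeat at index 1.
        System.out.println(backrefMatches("ac", 1, 0, 1));
    }
}
```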
- RuntimeCompiler: replace CapturePolicy import with ReggieOption/UnsupportedPatternException
- Add cacheKeyFor() helper (flag-aware cache key) and fallbackOrThrow() helper
- Gate all 6 JavaRegexFallbackMatcher construction sites behind ALLOW_JDK_FALLBACK flag
- compileHybrid() receives ReggieOptions to propagate fallback policy
- UnsupportedPatternException propagates through catch(Exception) via explicit re-throw
- 34 test files updated: add allowJdkFallback() for patterns requiring JDK fallback
- New FallbackPolicyTest: throwsByDefault, delegatesWhenFallbackEnabled, nativePatternUnaffected
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
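The opt-in fallback policy can be pictured roughly as follows. Everything in this sketch is a stand-in: the `Option` enum, the exception class, and the signatures of `cacheKeyFor` and `fallbackOrThrow` are assumptions made for illustration and may differ from the real RuntimeCompiler helpers.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical model of "throw by default, delegate to the JDK only when asked".
final class FallbackPolicySketch {
    enum Option { ALLOW_JDK_FALLBACK }          // stand-in for ReggieOption

    static final class UnsupportedPatternException extends RuntimeException {
        UnsupportedPatternException(String message) {
            super(message);
        }
    }

    /** Flag-aware cache key: the same regex compiled under different policies must not share an entry. */
    static String cacheKeyFor(String regex, Set<Option> options) {
        return (options.contains(Option.ALLOW_JDK_FALLBACK) ? "F:" : "N:") + regex;
    }

    /** Called where a native strategy is unavailable: delegate if allowed, otherwise fail loudly. */
    static Pattern fallbackOrThrow(String regex, Set<Option> options) {
        if (options.contains(Option.ALLOW_JDK_FALLBACK)) {
            return Pattern.compile(regex);
        }
        throw new UnsupportedPatternException("no native strategy for: " + regex);
    }

    public static void main(String[] args) {
        Set<Option> allow = EnumSet.of(Option.ALLOW_JDK_FALLBACK);
        Set<Option> strict = EnumSet.noneOf(Option.class);
        System.out.println(fallbackOrThrow("(a+)\\1", allow).matcher("aa").matches());
        System.out.println(cacheKeyFor("(a+)\\1", allow).equals(cacheKeyFor("(a+)\\1", strict)));
    }
}
```

Prefixing both variants of the key keeps the two policies from colliding for any regex text; a composite key object would serve the same purpose.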
…otations ReggieOption moved from reggie-runtime to reggie-annotations so the annotation type can reference it without a circular dependency.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…onPriorityConflict
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0b9dc69e9a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…nt==0 branch
- FallbackPatternDetector.isPrefixNodeHandleable: reject unbounded quantifiers (max==-1); greedy prefix loop commits without backtracking so a*(a+)\1 on "aa" would fail natively. Routes to fallback engine instead.
- FallbackPatternDetector.hasStringEndAnchorInAltHelper: unwrap non-capturing groups before the AnchorNode check so (?:\Z)|abc is treated as a pure-anchor branch (same as bare \Z|abc), preventing unnecessary OPTIMIZED_NFA fallback.
- PatternAnalyzer: remove dead nfa.getGroupCount()==0 branch inside the nfa.getGroupCount()>0 guard block; zero-group patterns handled outside this block.
- Add regression tests for the above in BackrefEngineGapsTest and AnchorAlternationPikeVMTest.
- StrategyCorrectnessMetaTest: clarify OPTIMIZED_NFA representative is JDK-fallback.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
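The a*(a+)\1 case can be reproduced outside the engine. The sketch below contrasts the JDK's backtracking result with a hand-written model of a greedy prefix loop that commits to the longest run; the model is a simplified illustration of the failure mode, not the generated bytecode, and its names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of why an unbounded greedy prefix needs backtracking.
final class GreedyPrefixSketch {
    /** Models a*(a+)\1 with a prefix loop that never gives characters back. */
    static boolean matchesWithoutBacktracking(String input) {
        int i = 0;
        while (i < input.length() && input.charAt(i) == 'a') {
            i++;                               // a* commits to every leading 'a'
        }
        int groupStart = i;
        while (i < input.length() && input.charAt(i) == 'a') {
            i++;                               // (a+) finds nothing left to capture
        }
        int groupLen = i - groupStart;
        if (groupLen == 0) {
            return false;                      // (a+) requires at least one 'a'
        }
        return i + groupLen == input.length()
                && input.regionMatches(i, input, groupStart, groupLen);
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("a*(a+)\\1").matcher("aa");
        // The JDK backtracks a* down to zero characters, so group 1 is "a".
        System.out.println(m.matches() + " " + m.group(1)); // true a
        System.out.println(matchesWithoutBacktracking("aa")); // false
    }
}
```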
💡 Codex Review
Reviewed commit: a76989652b
…BACKREF Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
💡 Codex Review
Reviewed commit: 4b617a7a1f
💡 Codex Review
Reviewed commit: 178413f070
💡 Codex Review
Reviewed commit: c02d9bfa3d
💡 Codex Review
Reviewed commit: e9fda8c64c
All bytecode generators now handle the full Java $/\Z terminator set: lone \n (with CRLF guard), lone \r, \r\n pair at end-2, NEL (\u0085), LS (\u2028), PS (\u2029).
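The terminator set listed above matches what `java.util.regex` accepts for a non-multiline `$` (and `\Z`) without UNIX_LINES. A small sketch for checking each case against the JDK; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical probe: does "abc" followed by the given tail still satisfy abc$ / abc\Z?
final class DollarTerminators {
    static boolean endsBeforeTail(String anchor, String tail) {
        return Pattern.compile("abc" + anchor).matcher("abc" + tail).find();
    }

    /** CRLF guard: $ must not match between \r and \n. */
    static boolean matchesInsideCrlf() {
        return Pattern.compile("abc\\r$").matcher("abc\r\n").find();
    }

    public static void main(String[] args) {
        String[] tails = {"", "\n", "\r", "\r\n", "\u0085", "\u2028", "\u2029", "\n\n", "x"};
        for (String tail : tails) {
            // Print code points so invisible terminators are readable in the output.
            StringBuilder shown = new StringBuilder();
            tail.chars().forEach(c -> shown.append(String.format("\\u%04x", c)));
            System.out.println(shown + " $=" + endsBeforeTail("$", tail)
                    + " \\Z=" + endsBeforeTail("\\Z", tail));
        }
        System.out.println("inside CRLF: " + matchesInsideCrlf());
    }
}
```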
💡 Codex Review
Reviewed commit: 99d6ed726e
…ce; fix enforced gate budget default
What does this PR do?
Routes anchor-diluted capturing alternations and quantified-group alternation-priority conflicts to PIKEVM_CAPTURE (the priority-correct thread engine) ahead of the DFA paths, with a compileHybrid pre-check and a revert of over-broad PikeVM promotions. Adds backref fixes: B5 (guard lazy quantifiers in VARIABLE_CAPTURE_BACKREF: throw instead of silently returning wrong results), B7 (zero-length early-accept for nullable groups), B12 (quantifier-prefix backref bytecode + isPrefixNodeHandleable for unbounded/exact prefixes).
Motivation
Correct captures/matches for pattern classes the DFA strategies mishandled; make lazy-backref gaps fail loudly instead of silently producing wrong results.
Related Issue(s)
Stacked on PR1. Part of the 2026-06 capture-correctness & performance effort.
Improves the safety of (does not close) #33 (alternation-priority decline → fallback) and #37 (B5 lazy-quantifier guard → throws instead of producing wrong captures).
Change Type
Checklist
./gradlew build)
Performance Impact
None (routing/correctness).
Additional Notes
Stacked on pr/1-reggieoption-fallback-substrate. Includes a large docs-only commit (dfd070a, plan files). Contains worktree-agent merge commits. Cuts at 1133d2a.
🤖 Generated with Claude Code